Dynamic Grammars with Lookahead Composition for WFST-based Speech Recognition

Authors

  • Josef R. Novak
  • Nobuaki Minematsu
  • Keikichi Hirose
Abstract

Automatic Speech Recognition (ASR) applications often employ a mixture of static and dynamic grammar components, and can thus benefit from the ability to efficiently modify the system vocabulary and other parameters online. This paper presents a novel, generic approach to dynamic grammar handling in the context of the Weighted Finite-State Transducer (WFST) paradigm. The method relies on a straightforward extension of the lexicon and underlying grammar components, and leverages on-the-fly composition and delayed construction to generate the recognition search space efficiently at run time. The alternative partitioning of component models that this approach implies can also result in significant storage savings. In contrast to previous work in this area, the proposed method relies only on generic WFST operations and the context-dependency, lexicon and grammar components that form the basis of standard ASR cascades.
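
The idea of on-the-fly composition with delayed construction can be illustrated with a minimal sketch. This is not the paper's method: it assumes epsilon-free transducers over the tropical semiring, omits lookahead filters entirely, and uses hypothetical names (`Wfst`, `LazyCompose`, `transduce_cost`) and toy symbols chosen only for illustration. It shows that states of the composed machine are built only when the search actually visits them.

```python
class Wfst:
    """Epsilon-free weighted transducer; tropical semiring, so path weights add."""
    def __init__(self, start, finals, arcs):
        self.start = start
        self.finals = finals  # state -> final weight
        self.arcs = arcs      # state -> [(ilabel, olabel, weight, next_state)]


class LazyCompose:
    """Composition of a and b whose states are expanded only on demand."""
    def __init__(self, a, b):
        self.a, self.b = a, b
        self.start = (a.start, b.start)
        self._cache = {}      # composed state -> list of outgoing arcs

    def arcs(self, state):
        if state not in self._cache:
            qa, qb = state
            out = []
            # Match each output label of a against each input label of b.
            for i, mid_a, wa, na in self.a.arcs.get(qa, []):
                for mid_b, o, wb, nb in self.b.arcs.get(qb, []):
                    if mid_a == mid_b:
                        out.append((i, o, wa + wb, (na, nb)))
            self._cache[state] = out
        return self._cache[state]

    def final_weight(self, state):
        qa, qb = state
        if qa in self.a.finals and qb in self.b.finals:
            return self.a.finals[qa] + self.b.finals[qb]
        return None

    def expanded_states(self):
        return len(self._cache)


def transduce_cost(fst, inputs):
    """Lowest total weight of any accepting path that reads `inputs`, or None."""
    frontier = {fst.start: 0.0}
    for sym in inputs:
        nxt = {}
        for state, cost in frontier.items():
            for i, _o, w, n in fst.arcs(state):
                if i == sym and cost + w < nxt.get(n, float("inf")):
                    nxt[n] = cost + w
        frontier = nxt
    totals = [c + fst.final_weight(s) for s, c in frontier.items()
              if fst.final_weight(s) is not None]
    return min(totals) if totals else None


# Toy machines (hypothetical symbols): A rewrites a/b/c to x/y/z, B rewrites x/y/z to X/Y/Z.
A = Wfst(0, {3: 0.0}, {0: [("a", "x", 0.5, 1), ("b", "y", 1.0, 2)],
                       1: [("c", "z", 0.25, 3)],
                       2: [("c", "z", 0.25, 3)]})
B = Wfst(0, {2: 0.0}, {0: [("x", "X", 1.0, 1), ("y", "Y", 2.0, 1)],
                       1: [("z", "Z", 0.5, 2)]})

lazy = LazyCompose(A, B)
cost = transduce_cost(lazy, ["a", "c"])
```

In this toy run the composed state reached via the unused `b` branch is never expanded, which is the storage and time saving that delayed construction aims for. A practical decoder would additionally need epsilon handling and a composition filter.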

Related resources

A Specialized WFST Approach for Class Models and Dynamic Vocabulary

In this paper we describe a specialized Weighted Finite State Transducer (WFST) framework for handling class language models and dynamic vocabulary in automatic speech recognition. The proposed framework has several important features: a fused composition algorithm that substantially reduces memory usage in comparison to generic WFST operations, and an efficient dynamic vocabulary scheme th...

Generalized fast on-the-fly composition algorithm for WFST-based speech recognition

This paper describes a Generalized Fast On-the-fly Composition (GFOC) algorithm for Weighted Finite-State Transducers (WFSTs) in speech recognition. We already proposed the original version of GFOC, which yields fast and memory-efficient decoding using two WFSTs. GFOC enables fast on-the-fly composition of three or more WFSTs during decoding. In many cases, it is actually difficult or impossibl...

Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB), one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...

Improved Semi-dynamic Network

In this paper, we present an improved semi-dynamic network decoding strategy by incorporating a weighted finite-state transducer (WFST)-based search network. In our approach, a static search network is first optimized by applying WFST algorithms (determinization and minimization) to the composition of a lexicon and a language model. Then the WFST is partitioned into a set of subnetworks according...

Publication date: 2012